Open Access 



Research 



BMJ 




accessible medical research 



Childhood mortality in sub-Saharan 
Africa: cross-sectional insight into 
small-scale geographical inequalities 
from Census data 



Lawrence Kazembe, 1 Aileen Clarke, 2 Ngianga-Bakwin Kandala 2,3



ARTICLE SUMMARY 



To cite: Kazembe L, Clarke A, 
Kandala N-B. Childhood 
mortality in sub-Saharan 
Africa: cross-sectional insight 
into small-scale geographical 
inequalities from Census 
data. BMJ Open 2012;2:
e001421. doi:10.1136/
bmjopen-2012-001421

► Prepublication history and 
additional material for this 
paper are available online. To 
view these files please visit 
the journal online 
(http://dx.doi.org/10.1136/ 
bmjopen-2012-001421). 



Received 13 June 2012 
Accepted 21 August 2012 



This final article is available 
for use under the terms of 
the Creative Commons 
Attribution Non-Commercial 
2.0 Licence; see 
http://bmjopen.bmj.com 



1 Department of Statistics, 
University of Namibia, 
Windhoek, Namibia 
2 Division of Health Sciences, 
University of Warwick, 
Warwick Medical School, 
Coventry, Warks, UK 
3 Malaria Public Health and 
Epidemiology Group, Centre 
for Geographic Medicine, 
KEMRI-University of Oxford- 
Wellcome Trust Collaborative 
Programme, Nairobi, Kenya 



Correspondence to 

Dr Kandala Ngianga-Bakwin; 
N-B.Kandala@warwick.ac.uk 



ABSTRACT 

Objectives: To estimate and quantify childhood 
mortality, its spatial correlates and the impact of 
potential correlates using recent census data from 
three sub-Saharan African countries (Rwanda, Senegal 
and Uganda), where evidence is lacking. 
Design: Cross-sectional. 
Setting: Nation-wide census samples from three 
African countries participating in the 2010 African 
Census round. All three countries have conducted 
recent censuses and have information on mortality of 
children under 5 years. 

Participants: 111 288 children under the age of 

5 years in three countries. 

Primary and secondary outcome measures: 

Under-five mortality was assessed alongside potential 
correlates including geographical location (where 
children live), and environmental, bio-demographic and 
socioeconomic variables. 

Results: Multivariate analysis indicates that in all three
countries the overall risk of child death in the first
5 years of life has decreased in recent years (Rwanda:
HR=0.04, 95% CI 0.02 to 0.09; Senegal: HR=0.02,
95% CI 0.02 to 0.05; Uganda: HR=0.011, 95% CI
0.006 to 0.018). In Rwanda, fewer deaths were
associated with living in urban areas (HR=0.79, 95% CI
0.73 to 0.83) and with having a living mother (HR=0.16,
95% CI 0.15 to 0.17) or a living father (HR=0.38, 95% CI
0.36 to 0.39). Higher mortality was associated with male
children (HR=1.06, 95% CI 1.02 to 1.08) and Christian
children (HR=1.14, 95% CI 1.05 to 1.27). In all three
countries, children aged less than 1 year were at higher
risk of death than older children. There were also
significant spatial variations, showing inequalities in
child mortality by geographic location. In Uganda, for
example, areas of high risk are in the south-west and
north-west, and Kampala district showed a significantly
reduced risk.
Conclusions: We provide clear evidence of 
considerable geographical variation of under-five 
mortality which is unexplained by factors considered in 
the data. The resulting under-five mortality maps can 
be used as a practical tool for monitoring progress 
within countries for the Millennium Development Goal 
4 to reduce under-five mortality in half by 2015. 






Article focus 

■ Census and household data contain small-area geographical
information, such as the residence of a
child at the time of death. The impact of such
spatial effects on mortality is of substantive interest.

■ Mortality data in censuses and household surveys are also
subject to recall bias and heaping effects. The event
of death is also censored at the time of the survey.
We use recent statistical techniques to account for
the survival nature of the data in the analysis.

■ We use recent census/household data from
three sub-Saharan African countries (Rwanda, Senegal and
Uganda) to investigate the importance of country-specific
geographical factors on under-five mortality.

Key messages 

■ Our results provide clear evidence of considerable 
geographical inequalities of under-five mortality that 
is unexplained by socio-economic factors consid- 
ered in the data. 

■ Our findings indicate that public health interventions 
and health promotion to reduce under-five mortality, 
should take into account both individual and area vari- 
ation to account for the diversity of settings. 

■ Planning and intervention measures could have dif- 
ferent outcomes in terms of effectiveness in areas 
with a high degree of variability. Homogeneous 
policy intervention strategies may not give the 
required outcomes, as suggested by the large, significant
inequalities in under-five mortality within and
between countries in our study.

Strengths and limitations of this study 

■ Our study appears to be the only one that has attempted to
investigate geographical inequalities in under-five
mortality beyond individual and household factors
using merged census and household data from
sub-Saharan African countries.

■ The major strength is the use of census and nationally
representative household survey data to investigate and
explain district-level inequalities in under-five mortality
using a novel approach that accounts simultaneously
for individual, household and area factors.

■ The major limitation of this study is the cross-sectional
nature of the data, which does not permit
causal inferences about the association between under-five
mortality and the associated spatial effects, including
individual and household factors.

INTRODUCTION 

There have been considerable gains in child survival in 
the world over the past 10 years. Recent reports on the 
'State of the World's Children' indicate an overall 
decline in child mortality from 100/1000 children in 
1999 to 72/1000 children in 2010. 1 A number of coun- 
tries in North Africa, Eastern Europe, south-east Asia 
and Latin America have reduced under-five child mor- 
tality by half in the period between 1990 and 2010. 1 In 
contrast, countries in sub-Saharan Africa (SSA) have
remarkably high rates and rank among the worst
performers in reducing child mortality, with very few
making progress and the majority experiencing no
change or a reversal of gains made some 10 or so years
previously. In Ethiopia, Malawi and Namibia the decline
in under-five child mortality has been substantial despite 
meagre resources, while in other SSA countries it has 
remained the same. A case in point is DR Congo which 
posted an under-five mortality of 199/1000 children in 
1999 and the same again in 2009. In countries such as 
Chad, there has been an increase from 201/1000 chil- 
dren born in 1999 to 209/1000 children in 2009. In 
short, even with known solutions and international assist- 
ance, the transition from high mortality to low mortality 
is highly uneven in the SSA region. 

Several studies have shown that child survival in the 
first 5 years of life is influenced by a myriad of risk 
factors. For instance, Becher et al 2 quantified the effect
of risk factors for childhood mortality in a typical rural 
setting of Burkina Faso. They performed a survival ana- 
lysis of births within a population from a demographic 
surveillance system in 39 villages. In another study in 
rural Tanzania, Armstrong-Schellenberg et al 3 conducted
a community-based nested case-control study of post- 
neonatal deaths in children under 5 years, in which they 
investigated demographic and socio-economic factors, 
health-seeking behaviour, the household environment 
including accessibility to healthcare and individual child 
care factors. A similar population-based case-control 
study was carried out to investigate potential risk factors 
for postneonatal and child mortality in northern 
Ghana. 4 Child mortality demonstrated gender-based dis- 
parities, 5 varied with socio-economic inequalities 2 and 
was influenced by variation in coverage of interven- 
tions. 4 At times, living in either urban or rural areas can 
disadvantage under-five children's health. 6 The general 
picture is that major causes of childhood mortality, sum- 
marised as disease and malnutrition, are exacerbated by 
socio-economic differences and varied intervention 
coverage, and these risk factors apply at both individual 
and community levels. 8 

Many of these studies have been conducted at subnational
or national level. Our search showed that relatively
few studies have considered between-country or cross-national
analyses of childhood health and associated
determinants. Magadi 8 examined the risk factors of mal- 
nutrition among children whose mothers are infected 



with HIV in SSA. She applied a multilevel logistic regres- 
sion to the Demographic and Health Survey (DHS) data 
from 18 countries for the period 2003-2008. Another 
across-countries study was conducted by Kandala et al 9 
in which they considered geographical and socio- 
economic determinants of child undernutrition in 
Malawi, Zambia and Tanzania. This identified regional 
patterns which transcend national boundaries. In a 
similar study, Sherbinin 10 reported on biophysical and 
geographical correlates of child malnutrition in Africa. 
Wang, 11 using data from 60 countries, explored the 
global pattern of child mortality and investigated the 
determinants both at national and subnational level. 

While tackling the issue of determinants of child mor- 
tality, it is common that sample survey data have been 
used. Only a few studies have considered census data, 
mostly to analyse demographic indicators such as 
fertility, yet with very limited statistical modelling of 
childhood mortality. 12 Census data provide a large cross- 
sectional database that would allow investigation of the 
association between mortality or health outcomes and 
risk factors. Because census data are a complete enumer- 
ation of all individuals in a country, the statistical analysis 
is likely to have more power than data derived from a 
survey. Furthermore, a census provides a picture of the
country at a given point in time, allowing a better
understanding of risk factors, which is critical in explaining
variations and crucial for implementing interventions to
reduce child mortality.



OVERVIEW OF THE ANALYSIS OF MORTALITY DATA 

A number of statistical models have been proposed 
when analysing the risk of child mortality in the first 
5 years of life and its determinants. Most popular have 
been logistic regression models which assume child sur- 
vival as a binary response (either the child lived beyond 
5 years or died before the fifth birthday). In such 
models one estimates the probability of a child surviving 
and can include risk factors. The coefficients of risk 
factors can be interpreted as ORs. These models, never- 
theless, ignore the time to event (death), and therefore 
fail to capture exposure to the risk of dying or conceal 
the evolution of the subject's state over time. 13 More 
appropriately, survival models can be used to analyse the 
hazards of child survival. Both logistic and survival ana- 
lysis can be implemented within the basic generalised 
linear models (GLM) framework. 
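
As a rough illustration of the OR interpretation mentioned above, the sketch below computes a crude (unadjusted) odds ratio from two proportions. The inputs are the proportions dead reported for Rwanda in Table 1 (16.1% for children under 1 year, 2.1% for children aged 5 years); the function name is our own, and the result is not an estimate from any fitted model in this paper.

```python
def odds_ratio(p_exposed: float, p_reference: float) -> float:
    """Crude odds ratio comparing two proportions (no covariate adjustment)."""
    odds_exposed = p_exposed / (1.0 - p_exposed)
    odds_reference = p_reference / (1.0 - p_reference)
    return odds_exposed / odds_reference


# Proportions dead in Rwanda, taken from Table 1: <1 year versus 5 years.
crude_or = odds_ratio(0.161, 0.021)
print(f"Crude OR, <1 year vs 5 years: {crude_or:.2f}")
```

Such a ratio treats death as a simple yes/no outcome and says nothing about when the death occurred, which is the limitation that motivates the survival models discussed next.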

Research on survival analysis in demography and 
related fields has increased since the seminal work by 
Cox in the 1970s, 14 with application in child mortality or 
survival appearing in the 1990s and 2000s, 15-17 involving 
both standard proportional hazard models 12 and 
complex models. 18 Before Cox regression came into
use, life tables were used to estimate the probabilities
of survival of a given cohort. 19 The Cox regression 
and recent modern survival techniques, in contrast, 
permit effects of risk factors (determinants) to be estimated,
which is critical for designing interventions. Cox 
and other survival regression models have other advan- 
tages including analysis of censored and truncated 
response data, and time-varying effects. 20 

Two important extensions in survival regression models 
that have received considerable attention recently are the 
inclusion of random effects and flexible modelling 
through semiparametric and nonparametric approaches. 21 
Such analyses have an added advantage compared to 
ordinary GLM. Models that incorporate random effects 
are commonly called generalised linear mixed models 
(GLMM), and those that account for non-linearity are 
referred to as generalised additive models (GAM) and 
when extended to include random effects they are known 
as generalised additive mixed models (GAMM). 

The inclusion of random effects permits modelling of 
unmeasured and unobserved contextual factors in the 
models. These may act at family, community, district, 
regional or national levels since the underlying causes of 
neonatal mortality are multisectoral and interwoven. 8 17 
Those operating at individual, family, community and 
regional levels can have a direct or intermediary effect 
on the outcome. Regionally, expenditure on health ser- 
vices and cultural differences can also affect the survival 
status of children. In survival models, unobserved factors 
are considered as frailties, which adjust for hierarchical 
clustering of survival data. In essence, frailties are group 
specific factors acting on child survival, which together 
with individual factors may protect or accelerate death. 22 
Studies on child mortality by Sastry 15 employed nested 
frailty models to analyse child survival data clustered at 
community and family levels. Another study by Bolstad 
and Manda 16 showed significant heterogeneity at community
level, which can be attributed to differential 
availability of resources at community level. 

Recent studies have assumed that such unobserved 
factors vary spatially to give spatial frailty survival models. 
Banerjee et al 23 developed parametric frailty specifications 
based on both areal (lattice) and on point-referenced 
(geostatistical) spatial models, and compared them with 
traditional independent and identically distributed frailty 
and non-frailty approaches under a Weibull baseline 
hazard function in the context of county-level infant mor- 
tality data. Bastos and Gamerman 24 used a dynamic sur- 
vival model with spatial frailty to handle time-varying 
covariates in the presence of spatial effects. In another 
study, Li and Ryan 25 used a semiparametric frailty model 
to analyse spatial survival models. In the above studies, a 
single modelling framework was used to model the spatial 
and time-varying effects simultaneously. Moreover, some of 
the risk factors may be geographically varying, leading 
child mortality to vary in space, 6 18 26 with well documented
space-time interactions. 27 In another study, Kandala
et al applied a geo-additive model in which spatial, non-linear
and fixed effects were modelled simultaneously in a
single framework. However, none of these studies used
census data.



The main objective of this study is to analyse small-scale 
geographical variability in under-five mortality in the 
sub-Saharan region, by applying existing spatial statistical 
methodology. Our aim is to extend the standard Cox 
regression model to a random-effects model to permit 
spatial clustering and heterogeneity using census data 
from a number of countries. Specifically, we apply a GLMM
with spatially correlated random effects, as proposed by
Hennerfeind et al, 20 to analyse factors associated
with child survival in the first 5 years of life. This
modelling approach falls within a group termed structured
additive regression (STAR) models, introduced by
Kammann and Wand. 28 STAR models are a comprehensive 
class of models which permit simultaneous estimation of 
nonlinear effects of continuous covariates, with both spa- 
tially unstructured and structured components, together 
with the usual fixed effects in the predictor. 29 
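
For orientation, a generic STAR-type hazard predictor with the components listed above might be written as follows. The notation is our own assumption for illustration and is not taken from the appendix, which gives the authors' exact specification.

```latex
\lambda_i(t) = \exp\{\eta_i(t)\}, \qquad
\eta_i(t) = g_0(t) + \sum_{j} f_j(x_{ij})
          + f_{\mathrm{str}}(s_i) + f_{\mathrm{unstr}}(s_i)
          + \mathbf{v}_i^{\top}\boldsymbol{\gamma}
```

Here $g_0(t)$ is the log-baseline hazard, $f_j$ are smooth (possibly nonlinear) effects of continuous covariates $x_{ij}$, $f_{\mathrm{str}}$ and $f_{\mathrm{unstr}}$ are the spatially structured and unstructured effects of the area $s_i$ in which child $i$ lives, and $\mathbf{v}_i^{\top}\boldsymbol{\gamma}$ collects the usual fixed effects.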



Table 1 Summary of selected covariates used in the model

Covariate            Rwanda                  Senegal          Uganda
Residence
  Rural              38 154 ([illegible])    23 490 (10.6)    [illegible]
  Urban              6215 ([illegible])      10 036 (6.5)     [illegible]
Religion
  Christian          40 650 (7.9)            977 (6.9)        28 567 (4.4)
  Muslim             837 (8.2)               32 378 (9.4)     4167 (4.0)
  Others             4369 (7.1)              171 (10.5)       689 (2.7)
Age of the child
  <1 year            10 516 (16.1)           4806 (15.5)      9440 (10.1)
  1 year             7710 (11.1)             5276 (16.1)      9307 (1.8)
  2 years            7853 (5.0)              6278 (11.5)      6937 (0.8)
  3 years            6128 (3.7)              5628 (6.4)       3613 (0.4)
  4 years            5701 (2.6)              5780 (4.5)       2303 (0.5)
  5 years            6461 (2.1)              5811 (3.4)       1553 (9.8)
Sex of child
  Male               22 189 (8.5)            16 722 (8.2)     17 056 (4.4)
  Female             22 146 (6.9)            16 793 (10.2)    16 382 (4.2)
Electronic index
  Least              762 (3.0)               10 004 (9.8)     2373 (2.3)
  Less               22 250 (8.7)            3261 (10.8)      15 881 (4.9)
  Medium             619 (2.9)               8789 (11.2)      173 (1.7)
  More               19 249 (7.3)            4753 (8.5)       14 104 (4.1)
  Most               1489 (5.0)              6719 (6.2)       907 (3.0)
Shelter index
  Lowest             8168 (7.0)              6688 (5.4)       6644 (3.2)
  Low                10 904 (7.3)            6785 (9.2)       6647 (3.9)
  Medium             7706 (8.3)              6743 (10.5)      7461 (4.0)
  High               8434 (8.9)              6162 (10.9)      5571 (5.7)
  Highest            9157 (7.5)              7248 (10.8)      7169 (4.9)
Mother alive
  Yes                39 687 (1.0)            -                25 501 (4.1)
  No                 1247 (10.1)             -                7937 (5.1)
Father alive
  Yes                37 133 (0.7)            -                19 637 (4.0)
  No                 3200 (8.5)              -                13 801 (4.7)

Given in the table are the counts (and proportion dead) across
covariates, N (%).

The spatial analysis can be approached in various ways.
One is based on stationary Gaussian random fields,
which applies when the place of residence is known
exactly, given by geographical x-y coordinates; the
principle originates from geostatistics. 23 These can also be 
interpreted as two-dimensional surface smoothers based 
on radial basis functions, and have been employed by 
Kammann and Wand 28 to model the spatial component in 
Gaussian regression models. Another option is to use two- 
dimensional P-splines described in more detail in Brezger 
et al. 30 The advantage of these approaches is that they 
allow prediction of risk for locations where there are no 
data, thus allowing us to quantify small-scale variability. If 
observations are aggregated in geographical regions, 
spatial effects can be estimated using the Markov random 
field (MRF) approach, widely used in disease mapping. 23 
Modelling and inference can use a fully Bayesian 
approach. However, the empirical Bayesian approach via 
penalised likelihood techniques is also possible. 29 
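
To make the MRF idea concrete, the sketch below builds the precision (penalty) matrix of an intrinsic conditional autoregressive prior, a common form of MRF in disease mapping, from a neighbourhood list. The four districts and their adjacencies are hypothetical, and we assume this simple unweighted form purely for illustration; the appendix describes the prior actually used.

```python
def icar_precision(neighbours):
    """Return region order and the ICAR precision matrix Q.

    Q[i][i] is the number of neighbours of region i and Q[i][j] is -1
    when regions i and j share a border, 0 otherwise.
    """
    regions = sorted(neighbours)
    index = {region: i for i, region in enumerate(regions)}
    size = len(regions)
    Q = [[0.0] * size for _ in range(size)]
    for region, adjacent in neighbours.items():
        i = index[region]
        Q[i][i] = float(len(adjacent))
        for other in adjacent:
            Q[i][index[other]] = -1.0
    return regions, Q


def conditional_mean(region, effects, neighbours):
    """Prior mean of one region's effect: the average of its neighbours' effects."""
    adjacent = neighbours[region]
    return sum(effects[other] for other in adjacent) / len(adjacent)


# Hypothetical districts; adjacency must be symmetric.
neighbours = {"A": ["B", "C"], "B": ["A", "C", "D"], "C": ["A", "B"], "D": ["B"]}
regions, Q = icar_precision(neighbours)
mean_B = conditional_mean("B", {"A": 0.3, "C": -0.1, "D": 0.4}, neighbours)
```

Under this prior, each district's spatial effect is shrunk towards the average of its neighbours, which is what produces smooth risk maps from sparse district-level data.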

A detailed description of the statistical methodology 
used is in the appendix. In this study, we provide simula- 
tion studies and apply these techniques to the 2000- 
2010 Census data from selected SSA countries whose 
under-five death rates are at relatively similar ranking. 
Rwanda and Uganda are ranked 31st and 27th in the 
world respectively with regard to improvement in 
under-five mortality, while Senegal is ranked at 42nd. 
Rwanda and Uganda have an estimated under-five mor- 
tality of 91 and 99/1000 live births respectively, as of 
2010, reduced from 163 and 175/1000, respectively,
estimated in 1990. Senegal had an estimated rate of 75/1000 in
2010 compared to 139/1000 in 1990. Infant mortality
rates, estimated for 2010, are 59, 63 and 50 for 
Rwanda, Uganda and Senegal, respectively. More details 
on child survival can be found at UNICEF. 1 

DATA AND METHODS 
Data 

Data were analysed from three countries, Rwanda, Uganda 
and Senegal purposively selected because relevant census 
variables were available to carry out survival analyses for 
the first 5 years of life (under-five mortality: 5q0). 

For Rwanda, we analysed census 2001 data, while for 
Uganda and Senegal we used data from the 2002 round of 
the census. For all countries, data analysed were limited to 
an approximate 1% random sample of the census data, 
mainly due to the complexity of the models fitted. Total 
samples considered were 44 000 for Rwanda, 33 500 for 
Senegal and 33 400 for Uganda. While the censuses have 
limited numbers of variables, each child record, derived 
from self-reported information given by the household 
head, consisted of age at time of death, and whether a 
child was alive or dead at the time of census, as well as 
other covariates which may influence child mortality. Our 
analysis was restricted to children below the age of 5 years. 
Since no information was available as to whether the child 
was alive at the census prior to the current enumeration, 
the survival information was right-censored. Factors influ- 
encing child mortality varied from country to country, and 
questions were not uniform across the three countries. To 
enable comparability of the results we selected similar 



Table 2 Model comparison values based on Deviance Information Criterion (DIC) for the models

Model   Description                                                       D             pD      DIC
Rwanda
M0      Province (RE)                                                     24 060.2      12.4    24 084.9
M1a     Fixed effects only                                                10 256.7      21.9    10 300.7
M1b     Fixed + Province (RE)                                             10 184.8      29.3    10 240.3
M2a     Unstructured random effects (District)                            19 911.4      22.4    19 956.2
M2b     Structured spatial effects (District) only                        19 212.3      25.7    19 963.5
M3a     Structured effects (District) + unstructured (Province)           19 870.3      24.9    19 920.3
M3b     Fixed + structured effects (District) + unstructured (Province)   7667.2        44.9    7756.1
Senegal
M0      Province (RE)                                                     20 536.7      10.4    20 557.5
M1a     Fixed effects only                                                19 361.8      28.4    19 418.7
M1b     Fixed + Province (RE)                                             19 362.7      28.4    19 419.1
M2a     Unstructured random effects (District)                            15 284.5      54.2    15 356.8
M3a     Structured effects (District) + unstructured (Province)           15 284.6      56.4    15 361.5
M3b     Fixed + structured effects (District) + unstructured (Province)   14 574.1.6    68.6    14 711.1
Uganda
M0      Province (RE)                                                     11 681.1      63.3    11 807.5
M1a     Fixed effects only                                                10 613.0      20.1    10 653.1
M1b     Fixed + Province (RE)                                             10 425.9      76.8    10 606.6
M2a     Unstructured random effects (District)                            11 735.7      30.0    11 795.7
M2d     Fixed effects + structured spatial effects (District)             10 484.6      49.5    10 588.5
M3a     Structured effects (District) + unstructured (Province)           11 691.7      48.5    11 788.7
M3b     Fixed + structured effects (District) + unstructured (Province)   10 468.9      59.6    10 588.1

covariates. We considered the following individual socio-demographic
variables in the analysis as determinants of
child mortality: region and place of residence, education
level of mother, occupation of father, number of
under-five children in household, whether the previous
child died, whether the father or mother was alive, and
ownership of dwelling unit. We constructed two indices:
(1) a shelter index, based on the type of dwelling unit
(permanent, semipermanent, traditional), type of roof,
wall and floor, and type of toilet; and (2) an electronics
index, based on ownership of the following electronic
assets: radio, cell phone, television, phone, iron and
fridge. For spatial analysis, we used provinces and
districts as units of analysis. Table 1 gives a summary of
the variables used.

Statistical analysis 

We examined spatial variation in under-five mortality 
with a flexible geo-additive semiparametric mixed model 
while simultaneously controlling for spatial dependence 
and possibly non-linear effects of covariates within a sim- 
ultaneous, coherent regression framework. Individual 
data records were constructed for children in each 
country. 
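
The exact layout of these records is not described here. As a hedged sketch, one common way to prepare right-censored data for a discrete-time survival model is to expand each child into one row per year of age at risk; the function and field names below are hypothetical and not taken from the census files.

```python
def expand_child(child_id, last_interval, died):
    """Expand one child into person-period rows.

    last_interval: completed age in years at death or at the census (0 to 4).
    died: True if the child died in that interval, False if alive at the
          census (right-censored).
    """
    return [
        {
            "child": child_id,
            "interval": year,
            # The event flag is 1 only in the interval in which death occurred.
            "event": int(died and year == last_interval),
        }
        for year in range(last_interval + 1)
    ]


rows_died = expand_child("a", 2, True)       # died aged 2 years
rows_censored = expand_child("b", 1, False)  # alive at census, aged 1 year
```

A censored child contributes rows with no event, so the information that the child survived each observed year is retained rather than discarded.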

A more general spatial Cox regression model, according
to Hennerfeind et al, 20 which captures a wide range
of issues including spatial frailties, was adapted and
applied to determine factors associated with the risk of
early childhood mortality. We applied a fully Bayesian
approach based on Markov priors, using Markov
chain Monte Carlo (MCMC) techniques for inference
and model checking. For model choice, we used the
Deviance Information Criterion (DIC), developed as a
measure of fit and model complexity.
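
As a small worked check, the DIC can be written as the deviance evaluated at the posterior mean plus twice the effective number of parameters (pD). The sketch below applies this to three Rwanda rows of Table 2. That the D column is the deviance at the posterior mean is our inference from the arithmetic, not something the authors state, and not every row of the table reproduces exactly under this reading.

```python
def dic(deviance_at_posterior_mean: float, effective_parameters: float) -> float:
    """DIC = D(theta_bar) + 2 * pD; smaller values indicate a better trade-off."""
    return deviance_at_posterior_mean + 2.0 * effective_parameters


# (D, pD) pairs for Rwanda, copied from Table 2.
rwanda = {"M0": (24060.2, 12.4), "M1a": (10256.7, 21.9), "M3b": (7667.2, 44.9)}
recomputed = {model: dic(d, p_d) for model, (d, p_d) in rwanda.items()}
best_model = min(recomputed, key=recomputed.get)

for model, value in recomputed.items():
    print(f"{model}: recomputed DIC = {value:.1f}")
```

The recomputed values fall within about one unit of the tabulated DICs for these rows, and the ranking agrees with the model choice reported in the Results.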

The analysis was carried out using V.1.4 of the BayesX 
software package, 30 which permits Bayesian inference 
based on MCMC simulation techniques. For all models, 
25 000 iterations were run with the initial 5 000 dis- 
carded and every 20th sample stored to give a final 
sample of 1 000 for parameter estimation. Convergence 
was evaluated by inspecting trace and autocorrelation 
plots of samples for each chain, as well as through 
numerical summaries such as the $\sqrt{\hat{R}}$ diagnostic statistic
of Brooks and Gelman. 31 After 5000 iterations, all parameters
showed signs of convergence in the trace plots.
The values of $\sqrt{\hat{R}}$ also quickly approached 1 and were
all below the value of 1.12, which indicated convergence
of both pooled and within-interval widths to stability.
Statistical methods have also been discussed in more 
detail in the appendix. 
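
The sampling arithmetic and the convergence check can be sketched as follows. The diagnostic below is a simplified potential scale reduction factor that omits the degrees-of-freedom correction of Brooks and Gelman, and the chains are simulated; it is an illustration under those assumptions, not the BayesX implementation.

```python
import random
import statistics


def retained_draws(iterations: int, burn_in: int, thin: int) -> int:
    """Number of stored samples after discarding burn-in and thinning."""
    return (iterations - burn_in) // thin


def potential_scale_reduction(chains):
    """Simplified Gelman-Rubin statistic for equal-length chains."""
    n = len(chains[0])
    chain_means = [statistics.fmean(chain) for chain in chains]
    within = statistics.fmean([statistics.variance(chain) for chain in chains])
    between = n * statistics.variance(chain_means)
    pooled = (n - 1) / n * within + between / n
    return (pooled / within) ** 0.5


stored = retained_draws(25000, 5000, 20)

rng = random.Random(0)
mixed = [[rng.gauss(0.0, 1.0) for _ in range(1000)] for _ in range(3)]
stuck = [[rng.gauss(0.0, 1.0) for _ in range(1000)],
         [rng.gauss(5.0, 1.0) for _ in range(1000)]]
r_mixed = potential_scale_reduction(mixed)
r_stuck = potential_scale_reduction(stuck)
```

Chains sampling the same distribution give a value close to 1, whereas chains centred on different values give a value well above the 1.12 threshold quoted above.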

RESULTS 

Table 1 gives a summary of the selected covariates across 
the three countries. There are evident disparities by 
place of residence for all three countries, with rural chil- 
dren slightly disadvantaged in mortality. The same 
picture was observed by age, with children less than 



1 year disadvantaged compared to older children, with 
the proportion dying diminishing with increasing age. 
Children without a living mother or father were more likely
to die in their first 5 years of life. However, there was no 
clear pattern in relation to the shelter or electronics 
indices, or with religion or sex of the child. Similar 
results were obtained in the bivariate analyses presented 
in tables 3, 4, 5. 

In table 2, model selection values are given for the 
discrete-time survival models with different specifications 
of the covariates for the three countries. For all 
three datasets, the models which combined fixed
and random effects were better than those that did not,
indicating the importance of both sets
of factors in explaining child survival. For the Rwanda data,
the best model was model M3b, which combines fixed
effects at individual and household levels and random
effects at district and provincial levels. The DIC for
model M3b was 7756.1 compared to the nearest model,
M1b, with DIC=10240.3. Moving to the Senegal data, again 
the model that combined fixed and random effects 



Table 3 Fixed effects for Rwanda child survival

                       Bivariate analysis        Multivariate analysis
Variable               HR (95% CI)               HR (95% CI)
Intercept                                        0.04 (0.02 to 0.09)
Place of residence
  Urban                0.67 (0.60 to 0.75)       0.79 (0.73 to 0.83)
  Rural                1.00                      1.00
Dwelling ownership
  Yes                  0.99 (0.89 to 1.11)       -
  No                   1.00                      -
Religion
  Christian            1.19 (1.03 to 1.38)       1.14 (1.05 to 1.27)
  Muslim               1.28 (0.97 to 1.68)       0.95 (0.81 to 1.12)
  Others               1.00                      1.00
Sex of child
  Male                 1.20 (1.12 to 1.28)       1.06 (1.02 to 1.08)
  Female               1.00                      1.00
Electronic index
  Least                1.00                      1.00
  Less                 2.96 (1.96 to 4.46)       1.29 (1.16 to 1.51)
  Medium               0.74 (0.38 to 1.44)       0.64 (0.46 to 0.89)
  More                 2.45 (1.62 to 3.69)       1.31 (1.16 to 1.51)
  Most                 1.64 (1.03 to 2.62)       1.15 (0.99 to 1.35)
Shelter index
  Lowest               1.00                      1.00
  Low                  1.06 (0.95 to 1.18)       0.96 (0.88 to 1.01)
  Medium               1.20 (1.07 to 1.34)       0.98 (0.92 to 1.05)
  High                 1.28 (1.15 to 1.43)       1.05 (0.98 to 1.13)
  Highest              1.09 (0.98 to 1.22)       1.09 (1.03 to 1.19)
Mother alive
  Yes                  0.012 (0.011 to 0.013)    0.16 (0.15 to 0.17)
  No                   1.00                      1.00
Father alive
  Yes                  0.016 (0.014 to 0.018)    0.38 (0.36 to 0.39)
  No                   1.00                      1.00

Table 4 Fixed effects for Senegal child survival

                       Bivariate analysis        Multivariate analysis
Variable               HR (95% CI)               HR (95% CI)
Intercept                                        0.024 (0.015 to 0.045)
Place of residence
  Urban                0.63 (0.58 to 0.69)       1.01 (0.97 to 1.05)
  Rural                1.00                      1.00
Dwelling ownership
  Yes                  1.55 (1.40 to 1.73)       1.19 (1.09 to 1.28)
  No                   1.00                      1.00
Religion
  Christian            0.61 (0.36 to 1.03)       0.84 (0.56 to 1.18)
  Muslim               0.85 (0.53 to 1.35)       0.99 (0.71 to 1.42)
  Others               1.00                      1.00
Sex of child
  Male                 0.78 (0.73 to 0.84)       0.88 (0.85 to 0.91)
  Female               1.00                      1.00
Electronic index
  Least                1.00                      1.00
  Less                 1.12 (0.99 to 1.27)       1.04 ([illegible] to 1.12)
  Medium               1.16 (1.06 to 1.27)       1.02 (0.98 to 1.06)
  More                 0.90 (0.80 to 1.01)       1.001 (0.94 to 1.05)
  Most                 0.66 (0.59 to 0.73)       [illegible]
Shelter index
  Lowest               1.00                      1.00
  Low                  1.63 (1.43 to 1.86)       1.01 (0.95 to 1.07)
  Medium               1.96 (1.69 to 2.18)       1.01 (0.95 to 1.06)
  High                 1.94 (1.71 to 2.21)       0.99 (0.93 to 1.04)
  Highest              1.95 (1.72 to 2.21)       0.96 (0.96 to 1.09)





produced the best fit (model M3b). Model M3b had a 
DIC=14711.1, which is smaller than that of model M3a 
(DIC=15361.5; table 2). Similar results are obtained for 
the Uganda data with model M3b emerging as best fit, 
although model M2d was indistinguishable (see table 2). 

Tables 3, 4, 5 present estimates of fixed risk factors 
resulting from the models with the best fit. For Rwanda 
(table 3), there was an overall decrease of risk of a child 
dying in the first 5 years of life (HR=0.04, 95% CI 0.02 
to 0.09). Children in urban areas were less likely to die 
than those in rural areas (HR=0.79, 95% CI 0.73 to 
0.83). The relationship of child dying and household 
electronic assets was nonlinear. At level 2 compared to 
level 1, the risk was higher with HR=1.29 (95% CI 1.16 
to 1.51), while at level 3 we observed a lower risk with 
HR=0.64 (95% CI 0.46 to 0.89) and this is reversed at 
level 4 with HR=1.31, 95% CI: 1.16 to 1.51. For the 
shelter index, the risk was reduced at lower levels and 
increased at higher levels of the index, although this 
relationship was not significant at p<0.05. A child
with a living mother and father had
a reduced risk of dying (table 3). Children up to 1 year
of age were at increased hazard relative to those aged
5 years or older. At less than 1 year of age the log hazard
was 0.77 (95% CI 0.71 to 0.83), while at 1 year the log
hazard was 0.48 (95% CI 0.42 to 0.55). As age increased,
the hazard reduced: for children aged 2, 3 and 4 years,
the log hazard was -0.02, -0.22 and -0.46, respectively.
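
Because the age effects are reported on the log-hazard scale while the other effects are reported as HRs, it can help to exponentiate them. The short sketch below does this for the values quoted in this paragraph; the dictionary keys are our own labels.

```python
import math

# Log hazards by age for Rwanda, as quoted in the text (reference: 5 years or older).
log_hazards = {
    "<1 year": 0.77,
    "1 year": 0.48,
    "2 years": -0.02,
    "3 years": -0.22,
    "4 years": -0.46,
}

# HR = exp(log hazard); values above 1 indicate a higher hazard than the reference.
hazard_ratios = {age: math.exp(value) for age, value in log_hazards.items()}

for age, hr in hazard_ratios.items():
    print(f"{age}: HR = {hr:.2f}")
```

On this scale, children under 1 year have roughly twice the hazard of the reference group, and the ratio falls below 1 from age 2 years onwards.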



Table 5 Fixed effects for Uganda child survival

                       Bivariate analysis        Multivariate analysis
Variable               HR (95% CI)               HR (95% CI)
Intercept                                        0.011 (0.006 to 0.018)
Place of residence
  Urban                0.68 (0.57 to 0.81)       [illegible]
  Rural                1.00                      1.00
Employed
  Yes                  0.94 (0.85 to 1.05)       [illegible]
  No                   1.00                      -
Under 5 children
  None                 3.74 (1.41 to 10.40)      [illegible]
  1-3                  0.92 (0.33 to 2.41)       -
  >4                   1.00                      1.00
Married
  Yes                  1.14 (1.04 to 1.23)       [illegible]
  No                   1.00                      1.00
Polygamy
  Yes                  1.003 (0.84 to 1.19)      [illegible]
  No                   1.00                      -
Sex of last birth
  Male                 0.98 (0.89 to 1.08)       [illegible]
  Female               1.00                      1.00
Education
  None                 1.26 (1.10 to 1.45)       [illegible]
  Lower primary        1.02 (0.89 to 1.16)       [illegible]
  Upper primary        0.66 (0.54 to 0.81)       [illegible]
  Secondary            0.38 (0.32 to 0.62)       [illegible]
  Tertiary             1.00                      1.00
Electronic index
  Least                1.00                      1.00
  Less                 2.36 (1.75 to 3.10)       [illegible]
  Medium               0.84 (0.36 to 2.65)       [illegible]
  More                 1.99 (1.51 to 2.64)       [illegible]
  Most                 1.27 (0.80 to 2.16)       [illegible]
Shelter index
  Lowest               1.00                      1.00
  Low                  1.27 (1.06 to 1.52)       0.92 (0.83 to 1.03)
  Medium               1.31 (1.09 to 1.53)       0.88 (0.81 to 0.99)
  High                 1.97 (1.67 to 2.36)       1.32 (1.15 to 1.51)
  Highest              1.68 (1.42 to 1.99)       1.12 (0.97 to 1.23)



The spatial variability of the risk of dying is shown in figure 1, with log hazard ranging between -8.23 and 3.14. A number of areas were associated with increased risk of death compared with the overall mean. These areas are shown in white on the right map and lie in the south, west and centre of the country. Areas of reduced risk are shown in black.

Figure 1 Left: structured spatial effects, at district level in Rwanda, of child survival (model M3b). Shown are the posterior means. Right: corresponding posterior probabilities at 80% nominal level, white denotes regions with strictly positive credible intervals, black denotes regions with strictly negative credible intervals and grey depicts regions of non-significant effects.

In table 4 we present results for Senegal. Overall the risk of death decreases with HR=0.024 (95% CI 0.015 to 0.045). The risk varied significantly with ownership of dwelling unit, electronic assets, and sex and age of the child. Ownership of a dwelling unit was associated with increased risk (HR=1.19, 95% CI 1.09 to 1.28) compared with households without a dwelling unit. Male children were more likely to survive the first 5 years than female children (HR=0.88, 95% CI 0.85 to 0.91). The risk of dying was positively associated with all ages; however, this risk decreased with age, ranging from 2.48 at less than 1 year to 1.13 at age 4, compared with those aged 5 years or more. For ownership of electronic assets, the risk was highest for those at the lowest level (level 1) and decreased with increasing electronic assets, although the relationship was only marginally significant at p<0.1 for levels 2, 3 and 4 (results not shown). Nevertheless, the results were significant at p<0.05 for the level 5 category when compared with level 1 (HR=0.93, 95% CI 0.87 to 0.99). Turning to the spatial distribution of risk in figure 2, there was substantial variation, with estimates of log hazard ranging from -3.44 to 5.87 (left map). The right map defined areas associated with significantly high risk (shaded white) as well as those of significantly low risk (black shading). We could not identify a clear pattern to the risk by region.

Figure 2 Left: unstructured spatial effects, at district level in Senegal, of child survival (model M3b). Shown are the posterior means. Right: corresponding posterior probabilities at 80% nominal level, white denotes regions with strictly negative credible intervals, black denotes regions with strictly positive credible intervals and grey depicts regions of non-significant effects.

Results for the Ugandan data are given in table 5. Again the overall risk of death decreases (HR=0.011, 95% CI 0.006 to 0.017). The risk factors associated with under-five mortality were the number of under-five children in the household, marital status, education level of the mother, ownership of electronic assets and shelter characteristics. Children in households in the lowest category for number of under-five children (none; table 5) had a higher mortality risk than those in households with four or more (HR=4.18, 95% CI 2.92 to 5.81), while those in households with 1-3 children had a reduced risk (HR=0.54, 95% CI 0.39 to 0.73). Children of married mothers also appeared to be at increased risk of dying compared with children of single mothers (HR=2.10, 95% CI 1.78 to 2.42). Our results showed that the education level of the mother matters for child survival. Children whose mothers had no formal education or only lower primary education were more likely to die than those whose mothers had tertiary education (HR=1.48, 95% CI 1.28 to 1.70 and HR=1.21, 95% CI 1.05 to 1.45, respectively). For those with secondary education, the risk was lower relative to those with tertiary education (HR=0.57, 95% CI 0.36 to 0.81). In relation to electronic assets, the risk was non-linear, with increased risk at level 2, reduced risk at level 3 and increased risk again at levels 4 and 5, compared with level 1 (table 5). We observed that the risk was lower at levels 2 and 3 of the shelter index and increased at levels 4 and 5 relative to level 1. Nevertheless, the only significant differences were observed at levels 3 and 4 (HR=0.88 and 1.32, respectively; table 5).

The geographical variation in risk is shown in figure 3. Estimates ranged from -0.61 (low risk) to 0.73 (high risk; left map). The significance map (right map) indicates that areas of high risk are in the south-west and north-west, while those of low risk are in the north-east and centre-east. Notably, Kampala district showed a significantly reduced risk.

Figure 3 Left: structured spatial effects, at district level in Uganda, of child survival (model M3b). Shown are the posterior means. Right: corresponding posterior probabilities at 80% nominal level, white denotes regions with strictly negative credible intervals, black denotes regions with strictly positive credible intervals and grey depicts regions of non-significant effects.

The unstructured spatial effects at provincial level were also fitted. Figure 4 shows caterpillar plots for the three countries at province and county level. No single province or county residual was significantly above or below zero, indicating no significant difference in risk of death between provinces or counties in the three countries. However, the point estimates varied: in Rwanda, four provinces had an estimated lower risk of death while six had an estimated higher risk. For Senegal, four provinces had a reduced risk and eight an estimated high risk. In Uganda, about 100 counties were estimated to have a lower risk of child mortality, while another 70 had a high risk (figure 4).
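
The significance maps and caterpillar plots rest on whether an area's 80% credible interval excludes zero. The idea can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name, the input (a list of posterior draws for one area effect) and the simple nearest-rank quantile rule are hypothetical and are not taken from the software used in the study.

```python
def classify_effect(draws, level=0.80):
    """Classify one area effect from its posterior draws.

    Uses an equal-tailed credible interval with a nearest-rank
    quantile rule (an assumption made for this sketch).
    """
    ordered = sorted(draws)
    n = len(ordered)
    tail = (1.0 - level) / 2.0
    lower = ordered[int(round(tail * (n - 1)))]
    upper = ordered[int(round((1.0 - tail) * (n - 1)))]
    if lower > 0:
        return "strictly positive"   # interval lies entirely above zero
    if upper < 0:
        return "strictly negative"   # interval lies entirely below zero
    return "non-significant"         # interval covers zero
```

An area whose interval covers zero falls in the grey (non-significant) class of the maps, whatever the sign of its posterior mean.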



DISCUSSION AND CONCLUSION 

The central aim of this study was to identify risk factors associated with child mortality that go beyond individual factors and include others such as geographic location. These factors were assumed to be best captured by spatially varying processes. To do so, we applied a novel Bayesian framework that permitted estimation of risk at individual, household and area level in a unified framework.
Figure 4 Left: unstructured spatial effects, at province level in Rwanda and Senegal and at county level in Uganda, of child survival (model M3b). Shown are the posterior means and corresponding error bars at the 80% nominal level.



Our modelling approach can be considered as an 
extension of the generalised linear model and can be 
classified as a spatial generalised linear model. These 
types of models have a complex structure, which is easily 
exploited using the Bayesian approach. 

Despite its complexity, our approach produced results consistent with what has been reported previously. For instance, the decrease in mortality risk with age is well established in SSA. 6 16 However, we found that the degree of association of various factors varied by country. We also observed a rural-urban divide in under-five mortality, with rural children more likely to die in the first 5 years than their urban counterparts.



The spatial effects were significant, indicating an important influence on child survival in the three countries. These spatial effects may represent many factors, and are likely surrogates for factors not captured by the census instruments. They may include distal factors such as access to health care, availability of health care centres, reproductive health behaviour, 32 and cultural and religious practices (including nutrition habits) specific to certain areas, which may either benefit children or put them at increased risk of mortality; or they may represent other factors such as disease prevalence or cost or quality. Understanding geographical variability of mortality is an increasingly important research approach. 17 However, this has often been done implicitly, using categorical variables to measure geographic effects. 31 In our approach, we explicitly introduced spatial effects and modelled them at the local level.

In our analyses of spatial frailties, we used the traditional approach, which extends hierarchical exchangeable frailties to incorporate spatial autocorrelation and heterogeneity. However, recent developments in the analysis of spatially correlated survival data have proposed a multivariate spatial dependence structure. 33 This is ideal for multiple spatially dependent data, or data that arise in spatially arranged settings. 23 34 Therefore, a conventional conditional autoregressive (CAR) model may not be ideal for capturing spatial frailties. Models designed through multiple membership multiple classification, 34 a mixture of Polya trees, 35 or a multivariate CAR 23 offer desirable properties. Indeed, further studies that consider the multilevel and multivariate structure of survival data are worth pursuing. 33 36 37



STRENGTHS AND LIMITATIONS

It should be noted that the census samples used here are 10 times bigger than those used in national surveys such as the DHS. However, some limitations of the present study deserve attention. First, census data are cross-sectional in nature, so the present study cannot establish temporality, and thus causality, of the observed associations. Given the self-reporting of child deaths in the census, we cannot disregard the likelihood that mortality outcomes may be influenced by the respondent's recall. In addition, information on variables relevant to mortality, such as income and disease, was limited or lacking in the census data compared with survey data. Nevertheless, our findings corroborate the notion that childhood mortality is an increasing public health issue in these countries, with evidence of considerable spatial variation across different provinces in the three countries.

Another important issue in the use of these data is data quality, because national censuses in developing countries are prone to incomplete or partial reporting of responses. Moreover, the use of complex questionnaires inevitably allows scope for inconsistent responses to be recorded for different questions, which further complicates the assessment of mortality outcomes.

In summary, the primary objective of this article was to illustrate a novel application of a recently developed structured additive regression model to analyse census data in SSA. The approach is data driven and has rarely been applied in the SSA region. Census data are often not used; this paper therefore exemplifies the use of such techniques with census data as opposed to survey samples, with results providing confirmatory data and, therefore, more confidence in censuses and national surveys. Indeed, the present paper offers incremental new information regarding child survival at the individual or geographical levels in Africa. Furthermore, the results emphasise that complex social and demographic processes operate in under-five mortality, which can be more clearly understood using adequate statistical modelling that analyses the outcome of mortality beyond the individual child's risk factors and incorporates distal factors such as the area where the child lives at the time of the survey. This has implications for these countries in terms of policy and planning for the achievement of Millennium Development Goal 4 (MDG 4), to reduce under-five mortality by two-thirds by 2015.
APPENDIX

Statistical methodology 

In studying child survival, we let $T$ denote the time to event, or survival time, with $t$ its actual realisation. The probability that a survival time $T$ is less than or equal to some value $t$ is given by

$$F(t) = \int_0^t f(u)\,du = \Pr(T \le t).$$

In our context, $F(t)$ is the cumulative probability that a child dies at or before some given time $t$, thus $F(t) = P(\text{child dies at time} \le t)$. The instantaneous probability that an event will occur in any given small interval is the density $f(t) = F'(t)$. The proportion of children surviving to time $t$ or beyond is $S(t) = 1 - F(t)$, which is also known as the survivor function. An important approach is to consider the duration analysis through the hazard rate. The hazard rate, which links the survival and failure functions, is of the form

$$h(t) = \frac{f(t)}{S(t)},$$

or equivalently $h(t) = -\frac{d}{dt}\log S(t)$. The hazard rate, unlike the survivor function, describes the risk of 'failure' given that the individual has survived up to time $t$.

In the analysis of child mortality, our interest is to answer this question: given that the child has survived up to month $t$, what is the likelihood it will survive in the subsequent months? Further to this, one is interested in how the hazard rate varies with respect to some covariates; for instance, will the hazard be the same for children living in urban and rural areas? One way to analyse such data is to use Kaplan-Meier survival curves and the log-rank test. This is an exploratory analysis that permits assessment of any differences in child survival by various covariates.
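
As an illustration of the exploratory step, the Kaplan-Meier estimate can be sketched in a few lines of Python. This is a minimal sketch, not the code used in the study: the function name and the toy input (months observed, and an indicator of 1 for death and 0 for censoring) are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one death.

    times  : observed months (time of death or of censoring)
    events : 1 if the child died at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Gather everyone whose observed time equals t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            # Product-limit step: multiply by the fraction surviving t.
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        # Deaths and censored children both leave the risk set.
        at_risk -= removed
    return curve
```

Computing the curve separately for, say, urban and rural children and comparing the two gives the kind of exploratory contrast described above; the log-rank test then formalises that comparison.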

An alternative that captures the effect of covariates is the Cox regression model, commonly referred to as the proportional hazards model. It should be pointed out, however, that various statistical models may be constructed; see Box-Steffensmeier and Jones 13 for an overview of the topic. In contrast to other data, several issues must be considered when analysing survival data. Central are censoring and truncation of survival data, the existence of time-varying covariates, the occurrence of multiple causes of death, whether event occurrences were recorded in discrete time, and the possibility of group-risk factors and confounders acting on the hazard. Thus, a more general model that incorporates all these issues, if they are present in the data, is needed.

We propose using a more general Cox model that captures a wide range of issues including spatial frailties. Thus, a spatial Cox regression model 20 was applied to determine factors associated with the risk of early childhood mortality. Assume that $T_{ij}$ is the observed number of months lived, or the censoring time, for the $j$th child in area $i$. Under Cox's model, the hazard function at time $T = t$ is given by

$$h(t \mid \beta, v_{ij}) = h_0(t)\exp(\beta v_{ij}) \qquad (1)$$

where $h_0(t)$ is the baseline hazard at time $t$, and $\beta$ is a vector of regression coefficients for the fixed and time-invariant variables ($v_{ij}$). The exponent of a coefficient, that is, $\exp(\beta)$, is interpreted as the HR, that is, the ratio of instantaneous risks, which is assumed to be constant over time. The HR compares rates of deaths in one group to some reference group, for a categorical variable, and to the mean for a continuous variable.
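
Because the HR is the exponent of a coefficient, estimates reported on the log-hazard scale can be converted directly. A minimal Python sketch follows; the function name is hypothetical, and the input values are the Rwanda estimates for children under 1 year quoted in the results (log hazard 0.77, 95% CI 0.71 to 0.83).

```python
import math

def hazard_ratio(log_hazard, lower, upper):
    """Exponentiate a log-hazard estimate and its interval limits to the HR scale."""
    return tuple(round(math.exp(x), 2) for x in (log_hazard, lower, upper))

# Rwanda, age under 1 year, relative to children aged 5 years or older.
print(hazard_ratio(0.77, 0.71, 0.83))
```

On this scale the estimate corresponds to an HR of about 2.16, that is, roughly twice the hazard of the reference group.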

Since individuals are clustered in geographical regions, a group-specific random frailty term, $\psi_i$, was introduced to augment the Cox model, that is,

$$h(t \mid \beta, v_{ij}, \psi_i) = h_0(t)\exp(\beta v_{ij} + \psi_i) \qquad (2)$$

The above model indicates that childhood survival is influenced by both individual-specific factors ($v_{ij}$) and group-specific environmental factors $\psi_i$. Here it was assumed that the environmental factors were approximated by geographical locations. In the case of geographical regions, spatially distributed random effects $s_i$ were assumed, while for the other unstructured heterogeneity a random effect, $u_i$, was specified such that $\psi_i = s_i + u_i$. Fitting model (2) assumed a semiparametric additive predictor, which is known as a geoadditive survival model, 20

$$\eta_{ij}(t) = f_0(t) + \beta v_{ij} + u_i + s_i \qquad (3)$$

where $\eta_{ij}(t)$ is the log-additive predictor at time $t$ for child $j$ in area $i$. The term $f_0(t) = \log(h_0(t))$ is the log baseline hazard effect at time $t$. The other terms are as defined above.
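
To make the additive structure concrete, the following Python sketch evaluates the hazard implied by the predictor for one child. It is an illustration only: the function name and all numerical inputs are hypothetical, and the covariate term is written as a plain inner product.

```python
import math

def hazard(log_baseline, beta, covariates, u_i, s_i):
    """Hazard implied by eta = f0(t) + beta.v + u_i + s_i, i.e. h = exp(eta)."""
    linear_part = sum(b * v for b, v in zip(beta, covariates))
    eta = log_baseline + linear_part + u_i + s_i
    return math.exp(eta)

# Two otherwise identical children living in areas whose structured
# spatial effects differ by 0.3 on the log-hazard scale (made-up values).
h_a = hazard(-4.0, [0.5, -0.2], [1.0, 1.0], 0.0, 0.3)
h_b = hazard(-4.0, [0.5, -0.2], [1.0, 1.0], 0.0, 0.0)
print(h_a / h_b)
```

Since every term enters on the log scale, an area effect multiplies the hazard by its exponent regardless of the child's covariates, which is why the maps of posterior means can be read as relative risk surfaces.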
Estimation: fully Bayesian approach

Prior distributions for covariate effects 

Modelling and inference use the fully Bayesian approach. In the Bayesian formulation, the specification of the proposed model (equation 3) is completed by assigning priors to all unknown parameters. For the fixed regression parameters, a suitable choice is the diffuse prior, that is, $p(\gamma) \propto \text{const}$, but a weakly informative Gaussian prior is also possible. The baseline hazard effect, $f_0(t)$, was assigned a penalised spline with a second-order random walk prior. Similarly, the time and continuous covariates were estimated non-parametrically through smoothness priors. We use the second-order Gaussian random walk prior to allow enough flexibility, while penalising abrupt changes in the function, as suggested by Brezger et al. 30 The prior can be expressed in the pairwise difference form as

$$p(f \mid \tau^2) \propto \exp\left(-\frac{1}{2\tau^2}\sum_{t=3}^{p}\left(f_t - 2f_{t-1} + f_{t-2}\right)^2\right) \qquad (4)$$

where $f = (f_1, \ldots, f_p)$ and $\tau^2$ is the variance, with diffuse priors $f_1 \propto \text{const}$, $f_2 \propto \text{const}$ for initial values.
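
The penalty carried by a second-order random walk prior can be sketched as follows. This is a hedged illustration in plain Python, with hypothetical function names, of the sum of squared second differences and the resulting log prior up to an additive constant; it is not the implementation used for the analysis.

```python
def rw2_penalty(f):
    """Sum of squared second differences of the function values f_1..f_p."""
    return sum((f[t] - 2 * f[t - 1] + f[t - 2]) ** 2 for t in range(2, len(f)))

def rw2_log_prior(f, tau2):
    """Log of a second-order random walk prior, up to an additive constant."""
    return -rw2_penalty(f) / (2.0 * tau2)

print(rw2_penalty([1.0, 2.0, 3.0, 4.0]))   # straight line: no penalty
print(rw2_penalty([0.0, 0.0, 1.0, 0.0]))   # abrupt spike: penalised
```

A straight line has zero second differences and so is not penalised at all, while an abrupt change is; the variance controls how strongly such departures are discouraged.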

For the unstructured spatial heterogeneity term, $u_i$ is assumed to follow an exchangeable Gaussian prior with zero mean and variance $\tau_u^2$, that is, $u_i \sim N(0, \tau_u^2)$. Finally, for the spatial components $s_i$, we assign a Markov random field (MRF) prior. 30 This is analogous to random walk models. The conditional distribution of $s_i$, given adjacent areas $s_j$, is a univariate normal distribution with mean equal to the average of the $s_j$ values of $s_i$'s neighbouring areas and variance equal to $\tau_s^2$ divided by the number of adjacent areas. This leads to a joint density of the form

$$p(s \mid \tau_s^2) \propto \exp\left(-\frac{1}{2\tau_s^2}\sum_{i \sim j}(s_i - s_j)^2\right) \qquad (5)$$

where $i \sim j$ denotes that area $i$ is adjacent to $j$, and assumes that parameter values $s_i$ and $s_j$ in adjacent areas are similar. The degree of similarity is determined by the unknown precision parameter $1/\tau_s^2$.
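
The conditional distribution described above, and the pairwise-difference quadratic form that such a prior implies, can be sketched in Python. The three-area adjacency structure, the effect values and the function names are hypothetical and serve only to illustrate the calculation.

```python
def conditional_moments(i, s, neighbours, tau2):
    """Conditional mean and variance of s[i] given its adjacent areas."""
    adjacent = neighbours[i]
    mean = sum(s[j] for j in adjacent) / len(adjacent)
    return mean, tau2 / len(adjacent)

def mrf_penalty(s, neighbours):
    """Sum of squared differences over adjacent pairs, each pair counted once."""
    return sum((s[i] - s[j]) ** 2
               for i in neighbours
               for j in neighbours[i]
               if i < j)

# Three areas in a row: 0 - 1 - 2 (made-up adjacency and effects).
neighbours = {0: [1], 1: [0, 2], 2: [1]}
s = [0.0, 1.0, 3.0]
print(conditional_moments(1, s, neighbours, tau2=1.0))
print(mrf_penalty(s, neighbours))
```

The more neighbours an area has, the smaller its conditional variance, so effects in well-connected districts are smoothed more strongly towards the local average.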

By writing $f_j = Z_j\beta_j$, $f_k = Z_k\beta_k$, $u = Z_l\beta_l$ and $s = Z_m\beta_m$, for a well-defined design matrix $Z$ and a (possibly high-dimensional) vector of regression parameters $\beta$, all different priors (equations 4 and 5) can be expressed in a general Gaussian form

$$p(\beta_j \mid \tau_j^2) \propto \exp\left(-\frac{1}{2\tau_j^2}\beta_j' K_j \beta_j\right) \qquad (6)$$

with an appropriate penalty matrix $K_j$. Its structure depends on the covariate and smoothness of the function. In most cases, $K_j$ is rank deficient and hence the prior for $\beta_j$ is improper. For the variances $\tau_j^2$ we assume inverse Gamma priors $IG(a_j, b_j)$, with hyperparameters $a_j$, $b_j$ chosen such that this prior is weakly informative.

Posterior distribution 

Fully Bayesian inference is based on the analysis of the posterior distribution of the model parameters. In general, the posterior is high dimensional and analytically intractable, which makes direct inference almost impossible. This problem is circumvented by using Markov chain Monte Carlo (MCMC) simulation techniques, whereby samples are drawn from the full conditionals of parameters given the remaining parameters and the data. Under conditional independence assumptions the posterior distribution for the Bernoulli model is given by Bayes' theorem

$$p(\beta, \tau^2, \gamma \mid \text{data}) \propto L(\text{data} \mid \beta, \tau^2, \gamma)\, p(\beta, \tau^2, \gamma) = L(\text{data} \mid \beta, \tau^2, \gamma) \prod_j \left\{ p(\beta_j \mid \tau_j^2)\, p(\tau_j^2) \right\} p(\gamma) \qquad (7)$$

where the quantity $p(\beta, \gamma, \tau^2)$ is the prior density function, and $L(\text{data} \mid \beta, \gamma, \tau^2)$ denotes the likelihood of the data. More specifically, the posterior is given by



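$$p(\beta, \tau^2, \gamma \mid \text{data}) \propto \prod_{i} h(t_i)^{y_{it}}\left\{1 - h(t_i)\right\}^{1 - y_{it}} \times \prod_j \frac{1}{(\tau_j^2)^{\mathrm{rank}(K_j)/2}} \exp\left(-\frac{1}{2\tau_j^2}\beta_j' K_j \beta_j\right) \times \prod_j (\tau_j^2)^{-a_j - 1}\exp\left(-\frac{b_j}{\tau_j^2}\right) \times p(\gamma)$$

where $y_{it}$ is a binary indicator coded 1 if an event occurs and 0 if an event does not occur at time $t$. For updating the full conditionals of parameters, we use a hybrid MCMC sampling scheme combining the iteratively weighted least squares proposals, developed for generalised linear mixed models (GLMM) by Brezger, 30 with the Metropolis-Hastings algorithm. Full details are presented elsewhere. 20 36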



Data analysis 

A number of models were explored. The first model (M0) explored unstructured variation for child $i$ at provincial level $k$:

$$\text{M0}: \eta_{ik} = \gamma_{\text{const}}\,\text{const} + f_{\text{unstr}}(\text{PROVINCE})$$

The second set of models estimated fixed effects only (M1a) and then adjusted for unstructured random effects at province level (M1b):

$$\text{M1a}: \eta_{ijk} = x_{ijk}'\gamma$$
$$\text{M1b}: \eta_{ijk} = x_{ijk}'\gamma + f_{\text{unstr}}(\text{PROVINCE})$$

We also investigated geographical variation at district level. We fitted both unstructured (M2a) and structured (M2b) random effects using districts as variables:

$$\text{M2a}: \eta = \gamma_{\text{const}}\,\text{const} + f_{\text{unstr}}(\text{DISTRICT})$$
$$\text{M2b}: \eta = \gamma_{\text{const}}\,\text{const} + f_{\text{str}}(\text{DISTRICT})$$

The last set of models combined fixed and random effects at district and province levels. In model M3a we estimated structured spatial effects at district level and unstructured effects at province level, and model M3b extended model M3a by adding the fixed effects:

$$\text{M3a}: \eta = \gamma_{\text{const}}\,\text{const} + f_{\text{str}}(\text{DISTRICT}) + f_{\text{unstr}}(\text{PROVINCE})$$
$$\text{M3b}: \eta = \gamma_{\text{const}}\,\text{const} + x_{ijk}'\gamma + f_{\text{str}}(\text{DISTRICT}) + f_{\text{unstr}}(\text{PROVINCE})$$



Model comparison was based on the deviance information criterion (DIC). This is given by

$$\text{DIC} = D(\bar{\theta}) + 2p_D,$$

where $D(\bar{\theta})$ is the deviance of the model evaluated at the posterior mean of the parameters, and represents the fit of the model to the data. The component $p_D$ is the effective number of parameters, which assesses the complexity of the model. Since small values of $D(\bar{\theta})$ indicate good fit while small values of $p_D$ indicate a parsimonious model, small values of DIC indicate a better model. Models with differences in DIC of <3 compared with the best model cannot be distinguished, while those with differences between 3 and 7 can be weakly differentiated. 38
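
The decision rule for DIC differences can be written as a small Python helper. This is a sketch under stated assumptions: the function name is hypothetical, the label for differences above 7 is our own wording (the text only defines the first two bands), and the example takes the Rwanda DIC values reported in the results as 14711.1 (M3b) and 15361.5 (M3a).

```python
def compare_dic(dic_candidate, dic_best):
    """Label a candidate model by its DIC difference from the best model."""
    diff = dic_candidate - dic_best
    if diff < 3:
        return "indistinguishable"
    if diff <= 7:
        return "weakly differentiated"
    # Band not named in the text; label assumed for illustration.
    return "clearly differentiated"

# Rwanda: model M3a compared with the best-fitting model M3b.
print(compare_dic(15361.5, 14711.1))
```

With a difference of about 650 DIC units, M3a falls well outside both bands, which is consistent with M3b being reported as the best fit.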